9.1 Central Limit Theorem: Introduction TCC
This module provides a brief introduction to the Central Limit Theorem. 


Student Learning Outcomes 
By the end of this chapter, the student should be able to: 


• Recognize Central Limit Theorem problems.
• Classify continuous word problems by their distributions.
• Apply and interpret the Central Limit Theorem for Means.
• Apply and interpret the Central Limit Theorem for Sums.


Introduction 


Why are we so concerned with means? Two reasons are that they give us a 
middle ground for comparison and they are easy to calculate. In this 
chapter, you will study means and the Central Limit Theorem. 


The Central Limit Theorem (CLT for short) is one of the most powerful
and useful ideas in all of statistics. The theorem has two alternative
forms, and both are concerned with drawing finite samples of size n from a
population with a known mean, $\mu$, and a known standard deviation,
$\sigma$. The first alternative says that if we collect samples of size n,
where n is "large enough," calculate each sample's mean, and create a
histogram of those means, then the resulting histogram will tend to have an
approximately normal bell shape. The second alternative says that if we
again collect samples of size n that are "large enough," calculate the sum
of each sample, and create a histogram, then the resulting histogram will
again tend to have a normal bell shape.


In either case, it does not matter what the distribution of the original
population is, and you do not even need to know it. The important fact
is that the sample means and the sums tend to follow the normal
distribution. You will learn the rest in this chapter.


The size of the sample, n, that is required in order to be 'large enough'
depends on the original population from which the samples are drawn. If
the original population is far from normal, then more observations are
needed for the sample means or the sample sums to be approximately normal.
Sampling is done with replacement.


Optional Collaborative Classroom Activity 


Do the following example in class: Suppose 8 of you roll 1 fair die 10 
times, 7 of you roll 2 fair dice 10 times, 9 of you roll 5 fair dice 10 times, 
and 11 of you roll 10 fair dice 10 times. 


Each time a person rolls more than one die, he/she calculates the sample 
mean of the faces showing. For example, one person might roll 5 fair dice 
and get a 2, 2, 3, 4, 6 on one roll. 


The mean is $\frac{2+2+3+4+6}{5} = 3.4$. The 3.4 is one mean when 5 fair
dice are rolled. This same person would roll the 5 dice 9 more times and
calculate 9 more means for a total of 10 means.


Your instructor will pass out the dice to several people as described above. 
Roll your dice 10 times. For each roll, record the faces and find the mean. 
Round to the nearest 0.5. 


Your instructor (and possibly you) will produce one graph (it might be a
histogram) for 1 die, one graph for 2 dice, one graph for 5 dice, and one
graph for 10 dice. Since the "mean" when you roll one die is just the face
on the die, what distribution do these means appear to be representing?


Draw the graph for the means using 2 dice. Do the sample means show 
any kind of pattern? 


Draw the graph for the means using 5 dice. Do you see any pattern 
emerging? 


Finally, draw the graph for the means using 10 dice. Do you see any 
pattern to the graph? What can you conclude as you increase the number of 
dice? 


As the number of dice rolled increases from 1 to 2 to 5 to 10, the following 
is happening: 


1. The mean of the sample means remains approximately the same. 

2. The spread of the sample means (the standard deviation of the sample 
means) gets smaller. 

3. The graph appears steeper and thinner. 


You have just demonstrated the Central Limit Theorem (CLT). 


The Central Limit Theorem tells you that as you increase the number of 
dice, the sample means tend toward a normal distribution (the 
sampling distribution). 
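
The classroom activity can also be tried on a computer. The following is a minimal sketch, not part of the original activity: it uses 10,000 simulated rolls per group instead of 10 rolls per student, and the seed value and the helper name `simulate_dice_means` are arbitrary choices made for illustration.

```python
import random
import statistics


def simulate_dice_means(num_dice, num_rolls, rng):
    """Return the mean face value for each of num_rolls rolls of num_dice fair dice."""
    return [
        sum(rng.randint(1, 6) for _ in range(num_dice)) / num_dice
        for _ in range(num_rolls)
    ]


rng = random.Random(2024)  # fixed seed so the run is repeatable (arbitrary value)
results = {}
for num_dice in (1, 2, 5, 10):
    means = simulate_dice_means(num_dice, 10_000, rng)
    # Store (mean of the sample means, standard deviation of the sample means).
    results[num_dice] = (statistics.fmean(means), statistics.stdev(means))
    print(f"{num_dice:>2} dice: mean of means = {results[num_dice][0]:.3f}, "
          f"SD of means = {results[num_dice][1]:.3f}")
```

If the simulation behaves like the classroom experiment, the mean of the sample means should stay near 3.5 for every group, while the standard deviation of the sample means should shrink as the number of dice grows.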


Glossary 


Average 
A number that describes the central tendency of the data. There are a 
number of specialized averages, including the arithmetic mean, 
weighted mean, median, mode, and geometric mean. 


Central Limit Theorem
Given a random variable (RV) with known mean $\mu$ and known
standard deviation $\sigma$. We are sampling with size n and we are
interested in two new RVs - the sample mean, $\bar{X}$, and the sample sum,
$\Sigma X$. If the size n of the sample is sufficiently large, then
$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$ and
$\Sigma X \sim N(n\mu, \sqrt{n}\,\sigma)$. If the size n of the sample is
sufficiently large, then the distribution of the sample means and the
distribution of the sample sums will approximate a normal distribution
regardless of the shape of the population. The mean of the sample
means will equal the population mean and the mean of the sample
sums will equal n times the population mean. The standard deviation
of the distribution of the sample means, $\frac{\sigma}{\sqrt{n}}$, is
called the standard error of the mean.


9.2 Central Limit Theorem: Central Limit Theorem for Sample Means TCC 


Suppose X is a random variable with a distribution that may be known or 
unknown (it can be any distribution). Using a subscript that matches the 
random variable, suppose: 


• $\mu_X$ = the mean of X
• $\sigma_X$ = the standard deviation of X


If you draw random samples of size n, then as n increases, the random
variable $\bar{X}$, which consists of sample means, tends to be normally
distributed and

$\bar{X} \sim N\left(\mu_X, \frac{\sigma_X}{\sqrt{n}}\right)$


The Central Limit Theorem for Sample Means says that if you keep
drawing larger and larger samples (like rolling 1, 2, 5, and, finally, 10 dice)
and calculating their means, the sample means form their own normal
distribution (the sampling distribution). The normal distribution has the
same mean as the original distribution and a variance that equals the
original variance divided by n, the sample size. n is the number of values
that are averaged together, not the number of times the experiment is done.


To put it more formally, if you draw random samples of size n, the
distribution of the random variable $\bar{X}$, which consists of sample
means, is called the sampling distribution of the mean. The sampling
distribution of the mean approaches a normal distribution as n, the sample
size, increases.


The random variable $\bar{X}$ has a different z-score associated with it
than the random variable X. $\bar{x}$ is the value of $\bar{X}$ in one sample.
Equation:


$z = \frac{\bar{x} - \mu_X}{\sigma_X / \sqrt{n}}$


$\mu_X$ is both the average of X and of $\bar{X}$.


$\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{n}}$ = standard deviation of
$\bar{X}$ and is called the standard error of the mean.


Example: 

An unknown distribution has a mean of 90 and a standard deviation of 15. 
Samples of size n = 25 are drawn randomly from the population. 
Exercise: 


Problem: 

Find the probability that the sample mean is between 85 and 92. 
Solution: 

Let X = one value from the original unknown population. The
probability question asks you to find a probability for the sample
mean.


Let $\bar{X}$ = the mean of a sample of size 25. Since $\mu_X = 90$,
$\sigma_X = 15$, and $n = 25$,
then $\bar{X} \sim N\left(90, \frac{15}{\sqrt{25}}\right)$.


Find $P(85 < \bar{x} < 92)$. Draw a graph.


$P(85 < \bar{x} < 92) = 0.6997$


The probability that the sample mean is between 85 and 92 is 0.6997.


[Figure: graph of $P(85 < \bar{x} < 92)$, with 85, 90, and 92 marked on
the horizontal axis]


TI-83 or 84: normalcdf(lower value, upper value, mean, standard
error of the mean)


The parameter list is abbreviated (lower value, upper value, $\mu$,
$\frac{\sigma}{\sqrt{n}}$).


normalcdf(85, 92, 90, 15/√25) = 0.6997
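
For readers without a TI calculator, the same probability can be checked in software. This is a sketch that assumes Python 3.8 or later, whose standard library provides `statistics.NormalDist`; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 90, 15, 25          # values given in the example
standard_error = sigma / sqrt(n)   # standard error of the mean, 15/sqrt(25) = 3

# Sampling distribution of the mean under the CLT: N(90, 3)
sample_mean_dist = NormalDist(mu, standard_error)

# P(85 < x-bar < 92) is the difference of two cumulative probabilities,
# which plays the role of normalcdf(85, 92, 90, 3) on the calculator.
p_between = sample_mean_dist.cdf(92) - sample_mean_dist.cdf(85)
print(round(p_between, 4))  # 0.6997, matching the calculator result above
```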
Exercise: 
Problem: 


Find the value that is 2 standard deviations above the expected value 
(it is 90) of the sample mean. 


Solution: 


To find the value that is 2 standard deviations above the expected 
value 90, use the formula 


value = $\mu_X$ + (# of STDEVs)$\left(\frac{\sigma_X}{\sqrt{n}}\right)$


value = $90 + 2 \cdot \frac{15}{\sqrt{25}} = 96$


So, the value that is 2 standard deviations above the expected value is 
96. 


Example: 

The length of time, in hours, it takes an "over 40" group of people to play 
one soccer match is normally distributed with a mean of 2 hours and a 
standard deviation of 0.5 hours. A sample of size n = 50 is drawn 
randomly from the population. 

Exercise: 


Problem: 


Find the probability that the sample mean is between 1.8 hours and 
2.3 hours. 


Solution: 
Let X = the time, in hours, it takes to play one soccer match. 


The probability question asks you to find a probability for the sample 
mean time, in hours, it takes to play one soccer match. 


Let $\bar{X}$ = the mean time, in hours, it takes to play one soccer match.


If $\mu_X$ = ____, $\sigma_X$ = ____, and n = ____, then
$\bar{X} \sim N$(____, ____) by the Central Limit Theorem for Means.


$\mu_X = 2$, $\sigma_X = 0.5$, $n = 50$, and
$\bar{X} \sim N\left(2, \frac{0.5}{\sqrt{50}}\right)$
Find $P(1.8 < \bar{x} < 2.3)$. Draw a graph.
$P(1.8 < \bar{x} < 2.3) = 0.9977$


normalcdf(1.8, 2.3, 2, 0.5/√50) = 0.9977


The probability that the mean time is between 1.8 hours and 2.3 hours
is 0.9977.


Glossary 


Normal Distribution
A continuous random variable (RV) with pdf
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)}$, where
$\mu$ is the mean of the distribution and $\sigma$ is the standard
deviation. Notation: $X \sim N(\mu, \sigma)$. If $\mu = 0$ and
$\sigma = 1$, the RV is called the standard normal distribution.


Standard Error of the Mean
The standard deviation of the distribution of the sample means,
$\frac{\sigma}{\sqrt{n}}$.


9.3 Central Limit Theorem: Using the Central Limit Theorem TCC 

Central Limit Theorem: Using the Central Limit Theorem is part of the 
collection col10555 written by Barbara Illowsky and Susan Dean. It covers 
how and when to use the Central Limit Theorem and has contributions from 
Roberta Bloom. 


It is important for you to understand when to use the CLT. If you are being 
asked to find the probability of the mean, use the CLT for the mean. If you 
are being asked to find the probability of a sum or total, use the CLT for 
sums. This also applies to percentiles for means and sums. 


Note: If you are being asked to find the probability of an individual value,
do not use the CLT. Use the distribution of its random variable.


Examples of the Central Limit Theorem 
Law of Large Numbers 


The Law of Large Numbers says that if you take samples of larger and
larger size from any population, then the mean $\bar{x}$ of the sample
tends to get closer and closer to $\mu$. From the Central Limit Theorem,
we know that as n gets larger and larger, the sample means follow a normal
distribution. The larger n gets, the smaller the standard deviation gets.
(Remember that the standard deviation for $\bar{X}$ is
$\frac{\sigma}{\sqrt{n}}$.) This means that the sample mean $\bar{x}$ must
be close to the population mean $\mu$. We can say that $\mu$ is the value
that the sample means approach as n gets larger. The Central Limit Theorem
illustrates the Law of Large Numbers.
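
The shrinking standard error can be seen numerically. The sketch below is an illustration under assumptions that are not in the text: the population is taken to be uniform on 1 to 5 (so $\mu = 3$), and the sample sizes and seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import fmean

A, B = 1, 5                        # assumed uniform population on [1, 5]
MU = (A + B) / 2                   # population mean, 3
SIGMA = sqrt((B - A) ** 2 / 12)    # population standard deviation, about 1.15

rng = random.Random(7)             # fixed seed for repeatability (arbitrary value)
sample_means = {}
standard_errors = {}
for n in (10, 100, 10_000, 100_000):
    sample = [rng.uniform(A, B) for _ in range(n)]
    sample_means[n] = fmean(sample)           # one observed sample mean
    standard_errors[n] = SIGMA / sqrt(n)      # sigma / sqrt(n)
    print(f"n = {n:>6}: sample mean = {sample_means[n]:.4f}, "
          f"standard error = {standard_errors[n]:.4f}")
```

As n grows, the standard error falls toward zero, so the observed sample mean is expected to land closer and closer to 3.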


Central Limit Theorem for the Mean and Sum Examples 


Example: 


A study involving stress is done on a college campus among the students. 
The stress scores follow a uniform distribution with the lowest stress 
score equal to 1 and the highest equal to 5. Using a sample of 75 students, 
find: 


1. The probability that the mean stress score for the 75 students is less 
than 2. 
2. The 90th percentile for the mean stress score for the 75 students. 


Let X = one stress score. 

Problems 1. and 2. ask you to find a probability or a percentile for a mean. 
Problems 3 and 4 ask you to find a probability or a percentile for a total or 
sum. The sample size, n, is equal to 75. 

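Since the individual stress scores follow a uniform distribution,
$X \sim U(1, 5)$ where a = 1 and b = 5 (see Continuous Random Variables
for the uniform).

$\mu_X = \frac{a+b}{2} = \frac{1+5}{2} = 3$

$\sigma_X = \sqrt{\frac{(b-a)^2}{12}} = \sqrt{\frac{(5-1)^2}{12}} = 1.15$


For problems 1. and 2., let $\bar{X}$ = the mean stress score for the 75
students. Then,


$\bar{X} \sim N\left(3, \frac{1.15}{\sqrt{75}}\right)$ where n = 75.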

Exercise: 
Problem: Find $P(\bar{x} < 2)$. Draw the graph.
Solution:
$P(\bar{x} < 2) = 0$


The probability that the mean stress score is less than 2 is about 0.


[Figure: graph of $P(\bar{x} < 2)$]


normalcdf(1, 2, 3, 1.15/√75) = 0


Note: The smallest stress score is 1. Therefore, the smallest mean for
75 stress scores is 1.


Exercise: 


Problem: 


Find the 90th percentile for the mean of 75 stress scores. Draw a 
graph. 


Solution: 
Let k = the 90th percentile.
Find k where $P(\bar{x} < k) = 0.90$.


[Figure: graph of $P(\bar{x} < k) = 0.90$]


The 90th percentile for the mean of 75 scores is about 3.2. This tells
us that 90% of all the means of 75 stress scores are at most 3.2 and
10% are at least 3.2.


invNorm(0.90, 3, 1.15/√75) = 3.2
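
Both stress-score answers can be reproduced without a calculator. The sketch below assumes Python 3.8 or later (`statistics.NormalDist`) and uses the unrounded uniform standard deviation rather than 1.15; it also takes the whole lower tail instead of starting at 1, which changes the probability only negligibly.

```python
from math import sqrt
from statistics import NormalDist

a, b, n = 1, 5, 75
mu = (a + b) / 2                    # mean of U(1, 5), which is 3
sigma = sqrt((b - a) ** 2 / 12)     # standard deviation of U(1, 5), about 1.15

# Sampling distribution of the mean under the CLT: N(3, sigma / sqrt(75))
mean_dist = NormalDist(mu, sigma / sqrt(n))

p_less_than_2 = mean_dist.cdf(2)    # P(x-bar < 2), effectively 0
k90 = mean_dist.inv_cdf(0.90)       # 90th percentile, the invNorm equivalent

print(f"P(x-bar < 2) = {p_less_than_2:.4f}")
print(f"90th percentile = {k90:.2f}")   # about 3.17, which rounds to 3.2
```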


For problems 3 and 4, let $\Sigma X$ = the sum of the 75 stress scores.
Then, $\Sigma X \sim N\left[(75)(3), \sqrt{75} \cdot 1.15\right]$
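
The parameters of the distribution of the sum can be computed and compared with a simulation. This is only a sketch: the number of simulated samples (4,000) and the seed are arbitrary assumptions, and the exact standard deviation $\sqrt{75}\,\sigma_X$ comes out to 10 because the unrounded $\sigma_X$ is used instead of 1.15.

```python
import random
from math import sqrt
from statistics import fmean, stdev

a, b, n = 1, 5, 75
mu = (a + b) / 2                   # 3
sigma = sqrt((b - a) ** 2 / 12)    # about 1.15

# CLT for sums: the sum is approximately N(n * mu, sqrt(n) * sigma)
sum_mean = n * mu                  # (75)(3) = 225
sum_sd = sqrt(n) * sigma           # sqrt(75) * 1.1547..., about 10

# Simulate many sums of 75 uniform stress scores and compare.
rng = random.Random(11)            # fixed seed for repeatability (arbitrary value)
sums = [sum(rng.uniform(a, b) for _ in range(n)) for _ in range(4000)]
sim_mean = fmean(sums)
sim_sd = stdev(sums)

print(f"theory:     mean = {sum_mean:.1f}, SD = {sum_sd:.2f}")
print(f"simulation: mean = {sim_mean:.1f}, SD = {sim_sd:.2f}")
```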


Example: 

Suppose that a market research analyst for a cell phone company conducts 
a study of their customers who exceed the time allowance included on their 
basic cell phone contract; the analyst finds that for those people who 
exceed the time included in their basic contract, the excess time used 
follows a left skewed distribution with a mean of 22 minutes and a 
standard deviation of 8 minutes. 

Consider a random sample of 80 customers who exceed the time allowance 
included in their basic cell phone contract. 

Let X = the excess time used by one INDIVIDUAL cell phone customer 
who exceeds his contracted time allowance. 

Let $\bar{X}$ = the mean excess time used by a sample of n = 80 customers
who exceed their contracted time allowance.


$\bar{X} \sim N\left(22, \frac{8}{\sqrt{80}}\right)$ by the CLT for Sample Means


Exercise: 
Problem: 
Using the CLT to find Probability: 


a. Find the probability that the mean excess time used by the 80
customers in the sample is longer than 20 minutes. This is asking
us to find $P(\bar{x} > 20)$. Draw the graph.


Solution: 


Part a.
Find: $P(\bar{x} > 20)$


$P(\bar{x} > 20) = 0.9873$ using normalcdf(20, 1E99, 22, 8/√80)


The probability is 0.9873 that the mean excess time used is more than
20 minutes, for a sample of 80 customers who exceed their contracted
time allowance.


[Figure: graph of $P(\bar{x} > 20)$, with 20 and 22 marked on the
horizontal axis]


Note: 1E99 = $10^{99}$ and -1E99 = $-10^{99}$. Press the EE key for E.
Or just use 10^99 instead of 1E99.


Exercise: 


Problem: 


Using the CLT to find Percentiles: 

Find the 95th percentile for the sample mean excess time for samples 
of 80 customers who exceed their basic contract time allowances. 
Draw a graph. 


Solution: 


Let k = the 95th percentile. Find k where $P(\bar{x} < k) = 0.95$


k = 23.47 using invNorm(0.95, 22, 8/√80) = 23.47


[Figure: graph of $P(\bar{x} < k) = 0.95$, with 22 and k marked on the
horizontal axis]


The 95th percentile for the sample mean excess time used is about 
23.47 minutes for random samples of 80 customers who exceed their 
contractual allowed time. 


95% of such samples would have means under 23.47 minutes; only 
5% of such samples would have means above 23.47 minutes. 
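
The two cell phone results can be verified in software as well. The sketch assumes Python 3.8 or later; `NormalDist.cdf` and `NormalDist.inv_cdf` stand in for the calculator's normalcdf and invNorm, and no 1E99 upper bound is needed because the right tail is obtained as a complement.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 22, 8, 80                       # values given in the example
mean_dist = NormalDist(mu, sigma / sqrt(n))    # CLT for sample means

# Right-tail probability: P(x-bar > 20) = 1 - P(x-bar <= 20)
p_longer_than_20 = 1 - mean_dist.cdf(20)

# 95th percentile of the sample mean excess time
k95 = mean_dist.inv_cdf(0.95)

print(round(p_longer_than_20, 4))  # 0.9873
print(round(k95, 2))               # 23.47
```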


Note: (HISTORICAL) Normal Approximation to the Binomial


Historically, being able to compute binomial probabilities was one of the 
most important applications of the Central Limit Theorem. Binomial 
probabilities were displayed in a table in a book with a small value for n 
(say, 20). To calculate the probabilities with large values of n, you had to 
use the binomial formula which could be very complicated. Using the 
Normal Approximation to the Binomial simplified the process. To 
compute the Normal Approximation to the Binomial, take a simple random 
sample from a population. You must meet the conditions for a binomial 
distribution: 


• there are a certain number n of independent trials
• the outcomes of any trial are success or failure
• each trial has the same probability of a success p


Recall that if X is the binomial random variable, then $X \sim B(n, p)$.
The shape of the binomial distribution needs to be similar to the shape of
the normal distribution. To ensure this, the quantities np and nq must both
be greater than five (np > 5 and nq > 5; the approximation is better if they
are both greater than or equal to 10). Then the binomial can be
approximated by the normal distribution with mean $\mu = np$ and standard
deviation $\sigma = \sqrt{npq}$. Remember that q = 1 - p. In order to get
the best approximation, add 0.5 to x or subtract 0.5 from x (use x + 0.5 or
x - 0.5). The number 0.5 is called the continuity correction factor.
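
To see what the continuity correction does, the approximation can be compared with the exact binomial probability. The numbers below (n = 100, p = 0.3, and the question P(X ≤ 25)) are hypothetical values chosen for illustration and do not come from the text; the sketch assumes Python 3.8 or later for `math.comb` and `statistics.NormalDist`.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, x = 100, 0.3, 25     # hypothetical binomial setting, not from the text
q = 1 - p

# Rule of thumb from the text: np and nq should both exceed 5.
conditions_met = n * p > 5 and n * q > 5

# Exact binomial probability P(X <= x) from the binomial formula.
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(x + 1))

# Normal approximation with mean np and standard deviation sqrt(npq).
approx_dist = NormalDist(n * p, sqrt(n * p * q))
with_correction = approx_dist.cdf(x + 0.5)   # "at most x" uses x + 0.5
without_correction = approx_dist.cdf(x)

print(f"conditions met:        {conditions_met}")
print(f"exact binomial:        {exact:.4f}")
print(f"normal, with 0.5:      {with_correction:.4f}")
print(f"normal, no correction: {without_correction:.4f}")
```

In this hypothetical case the corrected value should sit noticeably closer to the exact probability than the uncorrected one.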


**Contributions made to Example 2 by Roberta Bloom 


Glossary 


Exponential Distribution
A continuous random variable (RV) that appears when we are
interested in the intervals of time between some random events, for
example, the length of time between emergency arrivals at a hospital.
Notation: $X \sim Exp(m)$. The mean is $\mu = \frac{1}{m}$ and the
standard deviation is $\sigma = \frac{1}{m}$. The probability density
function is $f(x) = me^{-mx}$, $x \geq 0$, and the cumulative
distribution function is $P(X \leq x) = 1 - e^{-mx}$.


Mean
A number that measures the central tendency. A common name for
mean is 'average.' The term 'mean' is a shortened form of 'arithmetic
mean.' By definition, the mean for a sample (denoted by $\bar{x}$) is
$\bar{x} = \frac{\text{Sum of all values in the sample}}{\text{Number of values in the sample}}$,
and the mean for a population (denoted by $\mu$) is
$\mu = \frac{\text{Sum of all values in the population}}{\text{Number of values in the population}}$.


Uniform Distribution
A continuous random variable (RV) that has equally likely outcomes
over the domain, a < x < b. Often referred to as the Rectangular
distribution because the graph of the pdf has the form of a rectangle.
Notation: $X \sim U(a, b)$. The mean is $\mu = \frac{a+b}{2}$ and the
standard deviation is $\sigma = \sqrt{\frac{(b-a)^2}{12}}$. The
probability density function is $f(x) = \frac{1}{b-a}$ for
a < x < b or a ≤ x ≤ b. The cumulative distribution is
$P(X \leq x) = \frac{x-a}{b-a}$.


9.4 Central Limit Theorem: Practice TCC 


Student Learning Outcomes 


• The student will calculate probabilities using the Central Limit
Theorem.


Given 


Yoonie is a personnel manager in a large corporation. Each month she must
review 16 of the employees. From past experience, she has found that the
reviews take her approximately 4 hours each to do with a population
standard deviation of 1.2 hours. Let X be the random variable representing
the time it takes her to complete one review. Assume X is normally
distributed. Let $\bar{X}$ be the random variable representing the mean
time to complete the 16 reviews. Let $\Sigma X$ be the total time it takes
Yoonie to complete all of the month's reviews. Assume that the 16 reviews
represent a random set of reviews.


Distribution 


Complete the distributions. 


1. X ~ ____
2. $\bar{X}$ ~ ____
3. $\Sigma X$ ~ ____


Graphing Probability


For each problem below: 


• a. Sketch the graph. Label and scale the horizontal axis. Shade the
region corresponding to the probability.
• b. Calculate the value.


Exercise: 


Problem: 


Find the probability that one review will take Yoonie from 3.5 to 4.25 
hours. 


• a.
• b.


Solution:
• b. 3.5, 4.25, 0.2441
Exercise: 
Problem: 


Find the probability that the mean of a month’s reviews will take 
Yoonie from 3.5 to 4.25 hrs. 


• a.
• b.


Solution:
• b. 0.7499
Exercise: 
Problem: 


Find the 95th percentile for the mean time to complete one month’s 
reviews. 


• a.
• b. The 95th Percentile = ____


Solution:


• b. 4.49 hours
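

The three practice answers can be checked with a short script. This is a sketch assuming Python 3.8 or later; the variable names are illustrative, and the two distributions are the ones described in the Given section (one review, and the mean of 16 reviews).

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 4, 1.2, 16

one_review = NormalDist(mu, sigma)               # X: time for a single review
mean_review = NormalDist(mu, sigma / sqrt(n))    # X-bar: mean of 16 reviews

# Probability that one review takes from 3.5 to 4.25 hours
p_one = one_review.cdf(4.25) - one_review.cdf(3.5)

# Probability that the mean of the month's reviews is from 3.5 to 4.25 hours
p_mean = mean_review.cdf(4.25) - mean_review.cdf(3.5)

# 95th percentile for the mean time to complete one month's reviews
pct95 = mean_review.inv_cdf(0.95)

print(round(p_one, 4), round(p_mean, 4), round(pct95, 2))  # 0.2441 0.7499 4.49
```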


Discussion Question 


Exercise: 


Problem: What causes the probabilities in the first two exercises above
(one review versus the mean of a month's reviews) to differ?


9.5 Central Limit Theorem: Homework TCC 

The Central Limit Theorem: Homework is part of the collection col10555 
written by Barbara Illowsky and Susan Dean. 

Exercise: 


Problem: 


$X \sim N(60, 9)$. Suppose that you form random samples of 25 from this
distribution. Let $\bar{X}$ be the random variable of averages. Let
$\Sigma X$ be the random variable of sums. For c - f, sketch the graph,
shade the region, label and scale the horizontal axis for $\bar{X}$, and
find the probability.


• a. Sketch the distributions of X and $\bar{X}$ on the same graph.
• b. $\bar{X} \sim$ ____
• c. $P(\bar{x} < 60) =$ ____
• d. Find the 30th percentile for the mean.
• e. $P(56 < \bar{x} < 62) =$ ____
• f. $P(18 < \bar{x} < 58) =$ ____


Solution: 


• b. $\bar{X} \sim N\left(60, \frac{9}{\sqrt{25}}\right)$
• c. 0.5000
• d. 59.06
• e. 0.8536
• f. 0.1333
• h. 1530.35
• i. 0.8536


Exercise: 


Problem: 


Determine which of the following are true and which are false. Then, 
in complete sentences, justify your answers. 


• a. When the sample size is large, the mean of $\bar{X}$ is approximately
equal to the mean of X.
• b. When the sample size is large, $\bar{X}$ is approximately normally
distributed.
• c. When the sample size is large, the standard deviation of $\bar{X}$ is
approximately the same as the standard deviation of X.


Exercise: 


Problem: 


The percent of fat calories that a person in America consumes each 
day is normally distributed with a mean of about 36 and a standard 
deviation of about 10. Suppose that 16 individuals are randomly 
chosen. 


Let $\bar{X}$ = average percent of fat calories.


• a. $\bar{X} \sim$ ____(____, ____)
• b. For the group of 16, find the probability that the average
percent of fat calories consumed is more than 5. Graph the
situation and shade in the area to be determined.
• c. Find the first quartile for the average percent of fat calories.


Solution: 
• a. $N\left(36, \frac{10}{\sqrt{16}}\right)$
• b. 1
• c. 34.31


Exercise: 


Problem: 


Previously, De Anza statistics students estimated that the amount of 
change daytime statistics students carry is exponentially distributed 
with a mean of $0.88. Suppose that we randomly pick 25 daytime
statistics students.


• a. In words, X =
• b. X ~
• c. In words, $\bar{X}$ =
• d. $\bar{X} \sim$ ____(____, ____)
• e. Find the probability that an individual had between $0.80 and
$1.00. Graph the situation and shade in the area to be determined.
• f. Find the probability that the average of the 25 students was
between $0.80 and $1.00. Graph the situation and shade in the
area to be determined.
• g. Explain why there is a difference in (e) and (f).


Exercise: 


Problem: 


Suppose that the distance of fly balls hit to the outfield (in baseball) is 
normally distributed with a mean of 250 feet and a standard deviation 
of 50 feet. We randomly sample 49 fly balls. 


• a. If $\bar{X}$ = average distance in feet for 49 fly balls, then
$\bar{X} \sim$ ____
• b. What is the probability that the 49 balls traveled an average of
less than 240 feet? Sketch the graph. Scale the horizontal axis for
$\bar{X}$. Shade the region corresponding to the probability. Find the
probability.
• c. Find the 80th percentile of the distribution of the average of 49
fly balls.


Solution: 


• a. $N\left(250, \frac{50}{\sqrt{49}}\right)$
• b. 0.0808
• c. 256.01 feet


Exercise: 


Problem: 


According to the Internal Revenue Service, the average length of time 
for an individual to complete (record keep, learn, prepare, copy, 
assemble and send) IRS Form 1040 is 10.53 hours (without any 
attached schedules). The distribution is unknown. Let us assume that 
the standard deviation is 2 hours. Suppose we randomly sample 36 
taxpayers. 


• a. In words, X =
• b. In words, $\bar{X}$ =
• c. $\bar{X} \sim$
• d. Would you be surprised if the 36 taxpayers finished their Form
1040s in an average of more than 12 hours? Explain why or why
not in complete sentences.
• e. Would you be surprised if one taxpayer finished his Form 1040
in more than 12 hours? In a complete sentence, explain why.


Exercise: 
Problem: 
Suppose that a category of world class runners are known to run a
marathon (26 miles) in an average of 145 minutes with a standard
deviation of 14 minutes. Consider 49 of the races.


Let $\bar{X}$ = the average of the 49 races.

• a. $\bar{X} \sim$
• b. Find the probability that the runner will average between 142
and 146 minutes in these 49 marathons.
• c. Find the 80th percentile for the average of these 49 marathons.
• d. Find the median of the average running times.


Solution: 


• a. $N\left(145, \frac{14}{\sqrt{49}}\right)$
• b. 0.6247
• c. 146.68
• d. 145 minutes


Exercise: 


Problem: 


The attention span of a two-year-old is exponentially distributed with a
mean of about 8 minutes. Suppose we randomly survey 60 two-year-olds.


• a. In words, X =
• b. X ~
• c. In words, $\bar{X}$ =
• d. $\bar{X} \sim$
• e. Before doing any calculations, which do you think will be
higher? Explain why.
    ◦ i. the probability that an individual attention span is less than
    10 minutes; or
    ◦ ii. the probability that the average attention span for the 60
    children is less than 10 minutes? Why?
• f. Calculate the probabilities in part (e).
• g. Explain why the distribution for $\bar{X}$ is not exponential.


Exercise: 


Problem: 


The length of songs in a collector’s CD collection is uniformly 
distributed from 2 to 3.5 minutes. Suppose we randomly pick 5 CDs 
from the collection. There is a total of 43 songs on the 5 CDs. 


• a. In words, X =
• b. X ~
• c. In words, $\bar{X}$ =
• d. $\bar{X} \sim$
• e. Find the first quartile for the average song length.
• f. The IQR (interquartile range) for the average song length is
from ____ to ____.


Exercise: 


Problem: 


Salaries for teachers in a particular elementary school district are 
normally distributed with a mean of $44,000 and a standard deviation 
of $6500. We randomly survey 10 teachers from that district. 


• a. In words, X =
• b. In words, $\bar{X}$ =
• c. $\bar{X} \sim$
• f. Find the probability that the teachers earn a total of over
$400,000.
• g. Find the 90th percentile for an individual teacher's salary.
• h. Find the 90th percentile for the average teachers' salary.
• i. If we surveyed 70 teachers instead of 10, graphically, how
would that change the distribution for $\bar{X}$?
• j. If each of the 70 teachers received a $3000 raise, graphically,
how would that change the distribution for $\bar{X}$?


Solution: 


• c. $N\left(44{,}000, \frac{6500}{\sqrt{10}}\right)$
• e. $N\left(440{,}000, (\sqrt{10})(6500)\right)$
• f. 0.9742
• g. $52,330
• h. $46,634


Exercise: 


Problem: 


The distribution of income in some Third World countries is 
considered wedge shaped (many very poor people, very few middle 
income people, and few to many wealthy people). Suppose we pick a 
country with a wedge distribution. Let the average salary be $2000 per 
year with a standard deviation of $8000. We randomly survey 1000 
residents of that country. 


• a. In words, X =
• b. In words, $\bar{X}$ =
• c. $\bar{X} \sim$
• d. How is it possible for the standard deviation to be greater than
the average?
• e. Why is it more likely that the average of the 1000 residents will
be from $2000 to $2100 than from $2100 to $2200?


Exercise: 


Problem: 


The average length of a maternity stay in a U.S. hospital is said to be 
2.4 days with a standard deviation of 0.9 days. We randomly survey 80 
women who recently bore children in a U.S. hospital. 


• a. In words, X =
• b. In words, $\bar{X}$ =
• c. $\bar{X} \sim$
• f. Is it likely that an individual stayed more than 5 days in the
hospital? Why or why not?
• g. Is it likely that the average stay for the 80 women was more
than 5 days? Why or why not?
• h. Which is more likely:
    ◦ i. an individual stayed more than 5 days; or
    ◦ ii. the average stay of 80 women was more than 5 days?


Solution: 
• c. $N\left(2.4, \frac{0.9}{\sqrt{80}}\right)$
• e. $N(192, 8.05)$
• h. individual


Exercise: 


Problem: 


In 1940 the average size of a U.S. farm was 174 acres. Let’s say that 
the standard deviation was 55 acres. Suppose we randomly survey 38 
farmers from 1940. (Source: U.S. Dept. of Agriculture) 


• a. In words, X =
• b. In words, $\bar{X}$ =
• c. $\bar{X} \sim$
• d. The IQR for $\bar{X}$ is from ____ acres to ____ acres.


Try these multiple choice questions (Exercises 19 - 23).


The next two questions refer to the following information: The time to 
wait for a particular rural bus is distributed uniformly from 0 to 75 minutes. 
100 riders are randomly sampled to learn how long they waited. 

Exercise: 


Problem: 


The 90th percentile sample average wait time (in minutes) for a sample 
of 100 riders is: 


• A 315.0
• B 40.3
• C 38.5
• D 65.2


Solution: 


B 
Exercise: 


Problem: 


Would you be surprised, based upon numerical calculations, if the 
sample average wait time (in minutes) for 100 riders was less than 30 
minutes? 


• A Yes
• B No
• C There is not enough information.


Solution: 


A 
Exercise: 


Problem: 


Which of the following is NOT TRUE about the distribution for 
averages? 


• A The mean, median and mode are equal
• B The area under the curve is one
• C The curve never touches the x-axis
• D The curve is skewed to the right


Solution: 


D 


The next three questions refer to the following information: The cost of 
unleaded gasoline in the Bay Area once followed an unknown distribution 
with a mean of $4.59 and a standard deviation of $0.10. Sixteen gas stations 
from the Bay Area are randomly chosen. We are interested in the average 
cost of gasoline for the 16 gas stations. 

Exercise: 


Problem: 


The distribution to use for the average cost of gasoline for the 16 gas
stations is


• A $\bar{X} \sim N(4.59, 0.10)$
• B $\bar{X} \sim N\left(4.59, \frac{0.10}{\sqrt{16}}\right)$


• C $\bar{X} \sim N\left(4.59, \frac{0.10}{16}\right)$
• D $\bar{X} \sim N\left(4.59, \frac{16}{0.10}\right)$
Solution: 
B 
Exercise: 
Problem: 


What is the probability that the average price for 16 gas stations is over 
$4.69? 


A Almost zero 
B 0.1587 

C 0.0943 

D Unknown 


Solution: 


A 
Exercise: 
Problem: 


Find the probability that the average price for 30 gas stations is less 
than $4.55. 


• A 0.6554
• B 0.3446
• C 0.0142
• D 0.9858
• E 0


Solution: 


C 
Exercise: 


Problem: 


For the Charter School Problem (Example 6) in Central Limit 
Theorem: Using the Central Limit Theorem, calculate the following 
using the normal approximation to the binomial. 


• A Find the probability that fewer than 100 favor a charter school for
grades K - 5.
• B Find the probability that 170 or more favor a charter school for
grades K - 5.
• C Find the probability that no more than 140 favor a charter
school for grades K - 5.
• D Find the probability that there are fewer than 130 that favor a
charter school for grades K - 5.
• E Find the probability that exactly 150 favor a charter school for
grades K - 5.


If you either have access to an appropriate calculator or computer 
software, try calculating these probabilities using the technology. Try 
also using the suggestion that is at the bottom of Central Limit 
Theorem: Using the Central Limit Theorem for finding a website 
that calculates binomial probabilities. 


Solution: 


• C 0.0162
• E 0.0268


Exercise: 


Problem: 


Four friends, Janice, Barbara, Kathy and Roberta, decided to carpool 
together to get to school. Each day the driver would be chosen by 
randomly selecting one of the four names. They carpool to school for 
96 days. Use the normal approximation to the binomial to calculate the 
following probabilities. Round the standard deviation to 4 decimal 
places. 


• A Find the probability that Janice is the driver at most 20 days.
• B Find the probability that Roberta is the driver more than 16
days.
• C Find the probability that Barbara drives exactly 24 of those 96
days.


If you either have access to an appropriate calculator or computer
software, try calculating these probabilities using the technology. Try
also using the suggestion that is at the bottom of Central Limit
Theorem: Using the Central Limit Theorem for finding a website
that calculates binomial probabilities.


Solution: 


• A 0.2047
• B 0.9615
• C 0.0938


**Exercise 24 contributed by Roberta Bloom


